Instrumental variable approach on analyzing risk factors associated with noncommunicable disease prevalence in Tanzania: A nonexperimental design

Abstract
Background and Aims: Noncommunicable diseases (NCDs) have emerged as a substantial burden in developing countries, representing the leading cause of mortality. Addressing this critical issue necessitates effective interventions and policy measures. Therefore, this study aims to investigate the risk factors associated with NCD prevalence in Tanzania.
Methods: This study employed a nonexperimental research design, which allows secondary data to be analyzed without altering variables. The data were sourced from the National Panel Survey 2020/21 and the Household Budget Survey 2017/18. The econometric analysis applied two‐stage residual inclusion (2SRI) and the control function approach because of their ability to address endogeneity.
Results: The findings indicate a significant positive correlation between NCDs and both alcohol consumption (0.4110382, p = 0.02) and cigarette smoking (0.3354297, p < 0.001), emphasizing the urgency of targeted interventions to mitigate these behaviors. Conversely, a negative correlation is observed between NCDs and both fruit and vegetable intake (−0.1063375, p < 0.001) and physical exercise (−0.3744925, p < 0.001), underscoring the importance of promoting healthy dietary habits and frequent exercise.
Conclusion: These results accentuate the immediate need for targeted interventions and policy measures to address these risk factors and effectively combat the escalating burden of NCDs in Tanzania and similar contexts. Moreover, improved public awareness campaigns and the promotion of healthy lifestyles are vital to lowering the prevalence of NCDs across communities.

In a world with limited resources, prioritizing becomes a crucial endeavor, extending to the field of public health.[1] Together with an alarming prevalence of noncommunicable diseases (NCDs), the contemporary global landscape is witnessing an unprecedented rise in the elderly population. This context highlights the significance of risk reduction, NCD prevention, and healthy ageing promotion strategies. Over the past few decades, there has been a substantial expansion in the study of dementia risk factors.[3] NCDs pose a formidable threat to global health, accounting for 74% of global deaths, with 86% of premature deaths occurring in low- and middle-income countries.[4,5] The trajectory of African public health initiatives has primarily centered on communicable diseases.
However, an alarming increase in premature mortality and disability due to NCDs and mental health conditions requires a shift in focus.
The proportion of total disability-adjusted life years (DALYs) attributable to NCDs in Sub-Saharan Africa increased from 18% to 30% between 1990 and 2017.[6] This transition to a "triple burden" scenario, which includes communicable diseases, NCDs, and injuries, poses a difficult challenge.
Tanzania is not immune to the looming shadow of lifestyle-related NCDs. The 2022 Noncommunicable Disease Progress Monitor report revealed that NCDs account for 34% of annual Tanzanian deaths, or an average of 110,600 deaths annually.
Of concern, 17% of premature mortality risk is attributed to NCDs.[4,6] Efforts by the Tanzanian government and global organizations, exemplified by the World Health Organization, have yielded numerous strategies aimed at public awareness, improved healthcare infrastructure, and targeted policies to address lifestyle factors that contribute to NCDs.[7,8] Despite these commendable efforts, the attainment of NCD mitigation objectives remains incomplete, primarily due to a lack of emphasis on NCD mitigation on a national scale. This gap is fueled by a limited understanding of, and empirical limitations in identifying, the immediate risk factors and holistic repercussions of NCDs, ranging from individual households to the national economy. It emphasizes the need for extensive, rigorous scientific research into the intricate complexities of NCDs. Mayige et al. [9], Andrew [8], Kitole et al. [10], and Kitole et al. [7] have laid the groundwork; however, additional research is required to decipher the complex web of variables influencing NCD prevalence and its far-reaching consequences.
The diverse array of risk factors associated with NCDs has been illuminated by empirical research. Andrew [8] and Kitole et al. [4] highlight the prevalence of unhealthy lifestyles that contribute to the escalation of NCDs. Significantly, social interactions foster the imitation of behaviors that influence the emergence and prevalence of NCDs. Unwin et al. [11] note that unregulated habits, such as excessive or insufficient consumption, are significant contributors to reported cases of NCDs. According to the World Health Organization, NCDs were projected to account for approximately 77% of deaths between 1990 and 2020 due to urbanization and lifestyle changes.
In addition, studies highlight dietary habits, smoking, alcohol consumption, and insufficient physical activity as key risk factors for NCDs.[12] Overconsumption of salt, calories, and saturated fat exacerbates the burden of NCDs.[13] The correlation between excessive salt consumption and hypertension, which results in 59% of deaths in low-income countries, demonstrates the gravity of the problem.[14] Cholesterol levels are exacerbated by insufficient physical activity, which increases the likelihood of NCDs.[15,16] The impact of NCD risk factors differs by country and cluster.[18][19]
Sex, age, marital status, household size, and ethnicity are dynamic contributors to NCD prevalence.[20,21] Hypertension exemplifies hereditary characteristics contributing to familial susceptibility.[22,23] Diverse factors, including physical activity, diet, toxins, and urban stress, can explain urban-rural disparities in NCD prevalence.[24,25] Nevertheless, comprehensive analyses frequently overlook social interactions and hereditary factors, highlighting the importance of the present study's investigation.
The present study incorporates unobserved effects of social interactions and hereditary components to examine their implications for NCD risk factors across individuals' locality and family backgrounds. Thus, the study introduces social membership engagement/participation as a proxy for social interactions, shedding light on the complex dynamics that influence health outcomes.

Key points
• Noncommunicable diseases (NCDs) account for a large share of deaths globally and for 34% of total deaths in Tanzania.
• Adequate physical exercise and fruit and vegetable intake protect against NCDs, while alcohol consumption and cigarette smoking have strong positive correlations with NCDs.
• Urban residence, age, and sex vary in their correlation with NCD risk, highlighting the complexity of NCD determinants.
• Poor healthcare infrastructure makes households more vulnerable to NCDs, emphasizing the need to improve it.
• Public health campaigns, health education, healthcare system strengthening, and evidence-based policies are needed to reduce NCD prevalence, promote healthier lifestyles, and reduce risk factors.

| METHODS AND DATA
This study used a nonexperimental research design because it relies on secondary data from the Tanzania National Bureau of Statistics. The data sets consist of the National Panel Survey Wave 5 of 2020/21 and the Household Budget Survey 2017/18, combined to capture the information needed to explain the prevalence of NCDs across diverse communities in Tanzania. These data sets contain abundant socioeconomic and health-related information pertinent to the study's objectives (see Table 1).
In contrast, previous research efforts, such as those by Andrew [8] and Kitole et al. [4] in Tanzania, and Mwai and Muriithi [14] in Kenya, relied on single survey data sets, which limited the examination of essential household-level characteristics. Notably, the study measured NCD status as a binary variable: the dummy variable was assigned to households that reported any major NCD within the previous 12 months. Figure 1 provides additional geographical context by depicting a map of Tanzania's study area.
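The binary coding of NCD status described above can be sketched as follows. This is a minimal illustration only: the record layout, the field name `reported_ncd_12m`, and the household identifiers are hypothetical and are not taken from the survey files.

```python
def household_ncd_status(members):
    """Return 1 if any member reported a major NCD in the past 12 months, else 0."""
    return int(any(m["reported_ncd_12m"] for m in members))


# Hypothetical member-level records grouped by household identifier.
households = {
    "HH001": [{"reported_ncd_12m": False}, {"reported_ncd_12m": True}],
    "HH002": [{"reported_ncd_12m": False}],
}

# One dummy per household, as used for the dependent variable.
ncd_dummy = {hh_id: household_ncd_status(ms) for hh_id, ms in households.items()}
print(ncd_dummy)
```

A household with no reporting member is coded 0 under this rule.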
Stata 17.0 was used for coding, data management, and statistical analysis.
The approach to analysis included both descriptive and inferential statistics. Descriptive statistics, presented as tabulated frequencies, means, and standard deviations, summarize the fundamental attributes and characteristics of the data set.
For the inferential analysis, the instrumental variable models of two-stage residual inclusion (2SRI) and the control function approach (CFA) were used to analyze the risk factors for NCDs. Income was regarded as an endogenous variable because it also explains other risk factors, such as cigarette smoking and alcohol consumption. Therefore, to address endogeneity, the distance to the nearest water source was used as an instrument.
The models used (2SRI and CFA) assume that the relationship between the endogenous variable and the dependent variable is nonlinear and that the instrumental variable (distance to a water source) affects the endogenous variable but not the dependent variable directly.[5,26,27] The general structural equation that explains the relationship can be written as

$$Y = \Phi(\beta_0 + \beta_1 X + \beta_2 Z + \varepsilon)$$

where $Y$ is the NCD status, $X$ is the endogenous variable (income), $Z$ is the instrumental variable (distance to the nearest water source), $\beta_0$, $\beta_1$, and $\beta_2$ are the model parameters, $\varepsilon$ is the error term, and $\Phi$ is the cumulative distribution function of the standard normal distribution.
The relationship between the instrumental variable ($Z$) and the endogenous variable is estimated using the first-stage regression. The estimated first-stage coefficient of the instrumental variable is used to calculate the expected value of the endogenous variable ($\hat{X}$), which is then included as an independent variable in the second stage of regression with the dependent variable ($Y$) and other control variables ($W$).
The first-stage regression is expressed as

$$X = \gamma_0 + \gamma_1 Z + \delta$$

where $\gamma_0$ and $\gamma_1$ are the first-stage regression parameters and $\delta$ is the error term. The predicted values of the endogenous variable ($\hat{X}$) are obtained from this regression. In the second-stage regression, $W$ represents a vector of control variables, $\alpha_0$, $\alpha_1$, $\alpha_2$, and $\alpha_3$ are model parameters, and $\mu$ is the error term.
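For reference, a textbook 2SRI setup for a binary outcome can be written as below. This is a generic sketch and not necessarily the specification estimated in this study: the inclusion of the controls $W$ in the first stage, the coefficient vectors $\gamma_2$ and $\alpha_2$, and the assignment of $\alpha_3$ to the first-stage residual are assumptions made here for illustration.

$$
\begin{aligned}
\text{First stage:}\quad & X = \gamma_0 + \gamma_1 Z + W'\gamma_2 + \delta, \qquad \hat{\delta} = X - \hat{X} \\
\text{Second stage:}\quad & \Pr(Y = 1 \mid X, W, \hat{\delta}) = \Phi\left(\alpha_0 + \alpha_1 X + W'\alpha_2 + \alpha_3 \hat{\delta}\right)
\end{aligned}
$$

In this formulation the observed $X$ enters the second stage together with the first-stage residual $\hat{\delta}$, and a test of $\alpha_3 = 0$ serves as a test of exogeneity.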
The results of the instrumental variable models provide estimates of the influence of the endogenous variable on the dependent variable while accounting for the instrument (distance to a water source) and other control factors ($W$). The validity of the instrumental variable model rests on the premise that the instrument is relevant and exogenous, meaning that it influences the endogenous variable but does not affect the dependent variable directly. In addition, the model assumes that the instrument is unrelated to unobserved confounders of the endogenous and dependent variables.
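The logic of correcting for endogeneity with a first-stage residual can be illustrated with a small simulation. The sketch below is an assumption-laden toy example, not the study's estimation: it uses a linear outcome instead of a probit, simulated data, and made-up coefficients (a true effect of 1.0), and all variable names are hypothetical.

```python
import random


def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X)b = X'y."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * p for v, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


random.seed(0)
n = 5000
true_effect = 1.0
z = [random.gauss(0, 1) for _ in range(n)]           # instrument
u = [random.gauss(0, 1) for _ in range(n)]           # unobserved confounder
x = [1.0 + 0.8 * zi + ui for zi, ui in zip(z, u)]    # endogenous regressor
y = [0.5 + true_effect * xi + 0.7 * ui + random.gauss(0, 1)
     for xi, ui in zip(x, u)]

# Naive regression of y on x ignores the confounder and is biased.
naive = ols([[1.0, xi] for xi in x], y)

# First stage: regress x on the instrument and keep the residual.
g0, g1 = ols([[1.0, zi] for zi in z], x)
resid = [xi - (g0 + g1 * zi) for xi, zi in zip(x, z)]

# Second stage: include the first-stage residual as an extra regressor.
cf = ols([[1.0, xi, ri] for xi, ri in zip(x, resid)], y)

print("naive slope:", round(naive[1], 3))
print("control-function slope:", round(cf[1], 3))
print("residual coefficient:", round(cf[2], 3))
```

In this toy setup the coefficient on the residual absorbs the part of the regressor that is correlated with the unobserved confounder, which is why the slope on `x` moves back toward the true effect.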
Furthermore, the presence of endogeneity was established not by the researchers' subjective opinions but by formal statistical tests, which justify employing instrumental variable models. As shown in Appendix 1, the null hypothesis of exogeneity was rejected in favor of endogeneity (p = 0.005). The instrument, the distance to the water source, was chosen based on its ability to satisfy key conditions: it has a causal effect on the endogenous variable, affects the outcome only through the endogenous variable, and lacks confounding effects. Appendix 2 presents summary statistics for the first-stage regression on the distance to the water source. The R² value of 0.6537 indicates that the first stage explains about 65.37% of the variation in the endogenous variable, with an adjusted R² value of 0.5621. The partial R² value of 0.5388 suggests a strong relationship between distance to the water source and the endogenous variable, supported by the reported p-value of 0.0000. These findings affirm the instrument's strength.
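A partial R² can be converted into a first-stage F statistic for the excluded instrument, which is often compared with the common rule-of-thumb threshold of 10. The sketch below uses the reported partial R² of 0.5388; the sample size and the number of first-stage regressors are placeholders chosen for illustration, since the values behind Appendix 2 are not given here.

```python
def first_stage_f(partial_r2, n_obs, n_regressors, n_instruments=1):
    """F statistic for the excluded instruments implied by a partial R-squared.

    n_regressors counts all first-stage regressors (controls plus instruments),
    excluding the intercept.
    """
    if not 0.0 <= partial_r2 < 1.0:
        raise ValueError("partial_r2 must lie in [0, 1)")
    df_resid = n_obs - n_regressors - 1
    return (partial_r2 / n_instruments) / ((1.0 - partial_r2) / df_resid)


# Reported partial R-squared; n_obs and n_regressors are hypothetical.
f_stat = first_stage_f(0.5388, n_obs=1000, n_regressors=10)
print(round(f_stat, 1), f_stat > 10)
```

With one instrument, the statistic grows roughly in proportion to the residual degrees of freedom, so the actual value depends on the true sample size.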
The limited information maximum likelihood method was employed, with Appendix 3 showing the instrument's evaluation.
Findings in Table 2 show that most of those affected by NCDs are female (60%), while males account for just 40% of all NCD cases. Furthermore, findings in Table 2 show that nonmembers of social groups are more affected by NCDs than members. This suggests that social groups can help members obtain information on diseases and reduce their chances of being affected.
The study's findings in Figure 2 show that the Dar es Salaam region (red-shaded) had the highest prevalence of NCDs in Tanzania at 18.4%. Dar es Salaam is a well-known, highly urbanized trading city in East Africa and contributes more than 70% of Tanzania's GDP.
Therefore, its higher prevalence suggests that NCDs affect urban residents more, which is why NCDs have been termed urban diseases.
Moreover, findings in Table 3 show that the proportion of female-headed households was 52% while that of male-headed households was 48%. Of all households, 60.9% reside in rural areas and the remaining 39.1% reside in urban areas. In addition, only 29.4% of households consume vegetables and fruits regularly, while the majority (70.6%) do not. This indicates that most households do not tend to consume vegetables and fruits during their meals. Vegetables and fruits are rich in vitamins and important minerals for body repair, growth, and development.
On the other hand, the proportion of heads of households who consume alcohol was 45.9%, while those who do not was 54.1%; the proportion of cigarette smokers was 47.7%.
These findings indicate that many heads of households in Tanzania are engaging in the consumption of commodities that endanger their health and expose them to NCDs.

FIGURE 1 The map of Tanzania showing regions.
TABLE 2 NCDs across demographic and biological factors.
FIGURE 2 Prevalence of noncommunicable diseases (NCDs) among regions in Tanzania (n = 11,707). Source: Author's computation, 2023.
In addition, findings show that most heads of households (50.9%) have primary education as their highest level; 28.00% have secondary education, 10.75% college, and 3.90% university. In contrast, those with no educational background (not schooled) were 6.45% of the population under study. Education information reflects household knowledge and awareness of diseases, especially NCDs. Kitole et al. [5] argue that because NCDs develop slowly and become critical over time, a lack of sufficient education and of the ability to detect early symptoms increases the diseases' prevalence.
In cementing these arguments, Oyebode et al. [28] …
The analysis introduced interaction terms to address the heterogeneity between the residuals and their respective endogenous variables within the model by incorporating additional variables while estimating the structural equation. Consequently, interaction terms were established between variables such as alcohol consumption and fruit/vegetable intake and their corresponding residuals. Conversely, the coefficient for cigarette smoking, along with its interaction term and residuals, was found to be insignificant. It suggests that heterogeneity may not be a significant concern when considering the interaction between cigarette smoking and its associated variables. However, significant interaction terms were observed for vegetable intake and alcohol consumption.
Furthermore, the findings presented in Table 4 indicate that alcohol consumption and cigarette smoking have a positive and significant correlation with NCDs, indicating that increased alcohol consumption and cigarette smoking are associated with the development of NCDs. These findings are consistent with studies conducted by Ahmed et al. [29] and Dalal et al. [30], which emphasize the strong link between NCDs in developing countries and behavioral risk factors. On the other hand, the consumption of fruits and vegetables is found to have a negative and significant correlation with NCDs, implying that sufficient intake of fruits and vegetables helps reduce the NCD development rate across households.[32][33] Similar findings were also observed by Mwai (2014) in Kenya, who suggested that low vegetable and fruit consumption contributes to a 41% increase in the likelihood of developing NCDs.
Furthermore, Table 4 presents the correlation between proxy variables, including average cigarette smoking, alcohol consumption, and fruit and vegetable intake, and the prevalence of NCDs. A 1% rise in average cigarette smoking and alcohol consumption is associated with increases of 1.888% and 0.864%, respectively, in the likelihood of NCD occurrence. These findings suggest that smoking and alcohol consumption are influenced by social factors, such as neighborhood and peer influences, which heighten the risk of NCDs within households. These results align with previous studies conducted by Suls and Green [34], Larsen et al. [35], and Caudill and Kong [36], which highlight the impact of social imitation on alcohol consumption and cigarette smoking behaviors. They also support the view that positive social interactions are persuasive, encouraging healthier behaviors such as consuming nutritious foods and a well-balanced diet.
The study also establishes the relationship between demographic characteristics (sex, years of schooling, age, and area of residence) and NCD risk factors. Findings show that residing in an urban area has a positive and significant correlation with the development of NCDs compared with residing in a rural area. These findings support those of Tawa et al. [25], that urban residents exhibit a higher degree of risk factors contributing to NCD development.
Furthermore, the study finds that ageing in Tanzania has a positive correlation with NCDs, implying that older populations are more prone to NCDs than younger populations. These results differ from those of Barikdar et al. [37] and Dalstra et al. [38], who found that youths are more likely to get NCDs due to their lifestyles, primarily excessive alcohol consumption and insufficient time for physical exercise. On the other hand, although sex was not significant, it was found to have a negative correlation with NCDs, implying that being male is associated with a lower NCD risk than being female. However, there is no consensus on whether sex influences the early development of NCDs. Yet, studies by Tawa et al. [25], Lima et al. [39], and Taylor et al. [40] found that heart diseases develop more quickly among females than males.
Additionally, limited access to health services and low health promotion contribute to the vulnerability of households to NCDs.[10] These arguments agree with the findings of this study, which show that the farther healthcare facilities are from households, the greater the chances that households are endangered by diseases. The reason is the failure to get immediate healthcare consultation for disease symptoms, leading to more critical conditions or severe illnesses.

| CONCLUSION
The findings indicate noteworthy associations between NCDs and various socioeconomic characteristics. Hereditary factors were also found to significantly influence the presence of NCDs, underlining the role of genetic predisposition in shaping health outcomes across diverse socioeconomic backgrounds. Additionally, alcohol consumption and cigarette smoking displayed statistically significant positive relationships with NCDs, emphasizing the adverse impacts of these practices on health.
On the other hand, consuming fruits and vegetables exhibited a significant negative correlation with NCDs, suggesting that incorporating these nutrient-rich foods into one's diet may reduce NCD risk, particularly among individuals with varying socioeconomic profiles. Furthermore, the study revealed that engaging in physical exercise had a negative and significant coefficient, indicating its potential to mitigate the risk of NCDs and highlighting the importance of an active lifestyle in promoting better health outcomes. From these findings, the following recommendations are made to lower the prevalence of NCDs.
Improved public awareness campaigns: A thorough public health campaign strategy that aims to raise awareness of the adverse effects of alcohol consumption and smoking should be developed and implemented. This program should target both urban and rural populations.
By implementing policy changes based on these recommendations, a concerted effort can be made to address the risk factors linked to NCDs, setting the course for reducing their prevalence and easing the burden they place on society. In the ongoing struggle against the scourge of NCDs, these initiatives can have a profound and lasting impact when coordinated effectively at both the local and global levels in Tanzania.

| LIMITATION OF THE STUDY
Even though this study offers insightful information about the connection between risk factors and NCDs, it is important to recognize some limitations. Despite efforts to address endogeneity and heterogeneity using appropriate regression techniques, there may still be unobserved variables or factors that affect the relationship between risk factors and NCDs. The validity of the estimated effects may be impacted by omitted variable bias.
Moreover, the study's cross-sectional design makes it difficult to establish a causal connection between risk factors and NCDs. Experimental or longitudinal studies would offer more decisive proof of causality and the temporal sequence of the observed associations.
There is a risk of recall or social desirability bias because factors like alcohol consumption, smoking habits, and dietary intake depend on self-reported data. Participants might under- or overreport these behaviors, making the results inaccurate. The study might not have considered all external influences or confounding variables that could affect the link between risk factors and NCDs. These unmeasured variables might introduce bias and affect how the results are interpreted.
An in-depth comprehension of the study's scope and implications requires an awareness of these limitations. Future studies should address these issues and offer more details on the intricate connection between risk factors and NCDs.

The significant interaction terms for vegetable intake and alcohol consumption indicate the presence of heterogeneity resulting from the interaction between the analyzed endogenous variables and unobserved risk factors for NCDs. The CFA estimates reveal that the coefficient of income is positive, indicating a positive correlation between income and NCDs. It implies that households with higher incomes are at greater risk of developing NCDs than those with lower incomes, although the likelihood of NCD occurrence increases at a decreasing rate as income rises, so the additional risk is smaller among the wealthiest individuals.

FIGURE 3 The alcohol consumption (n = 11,810) and cigarette smoking (n = 12,273) across age cohorts.
TABLE 1 Definition and measurement of variables.
Dummy "Urban = 1 and 0 = otherwise"
TABLE 3 Social demographic characteristics on various parameters (n = 25,750).

The Sargan test and Basmann test in Appendix 4 exhibit high p-values, with test statistics of 35.0042 (p > 0.99) and 28.6433 (p > 0.99), respectively, suggesting no evidence to reject the null hypothesis that the instruments are valid. Thus, these tests support the validity of the instruments used in the analysis.
Findings in Table 2 show that 8551 out of 25,730 people, equivalent to 33%, are suffering from NCDs, while 67% are not. Moreover, results show that 2708 (32%) of the 8551 NCD cases are hereditary, while 5843 (68%) result from nonhereditary factors, which include lifestyles. The prevalence of NCDs by locality shows that only 3264 (38%) of those affected live in rural areas, while the majority (62%) live in urban areas.
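The Table 2 percentages quoted in the text can be checked directly from the reported counts. The short sketch below only redoes that arithmetic; the variable names are illustrative, and the urban count is derived here as the remainder of the NCD cases rather than taken from the table.

```python
# Counts as reported in the text for Table 2.
total_people = 25_730
ncd_cases = 8_551
hereditary = 2_708
non_hereditary = 5_843
rural = 3_264
urban = ncd_cases - rural  # derived, not reported directly


def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to a whole number."""
    return round(100 * part / whole)


shares = {
    "ncd_prevalence": pct(ncd_cases, total_people),
    "hereditary": pct(hereditary, ncd_cases),
    "non_hereditary": pct(non_hereditary, ncd_cases),
    "rural": pct(rural, ncd_cases),
    "urban": pct(urban, ncd_cases),
}
print(shares)
```

The hereditary and nonhereditary counts sum to the total number of NCD cases, so the two shares are complementary.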
TABLE 4 Contribution of specific risk factors to NCDs prevalence in Tanzania.

… facilities and improving access to medical care, especially in remote and underserved areas, is crucial. It will ensure early detection and successful treatment of NCDs and improve health outcomes through better accessibility. Therefore, this calls for the development of new telemedicine and mobile healthcare technologies, the expansion of primary healthcare services, and training programs for healthcare professionals.
Enacting policy interventions: To reduce alcohol and tobacco use, it is essential to advocate forcefully for laws and regulations that are supported by research. These include raising taxes on alcohol and tobacco products, enforcing strict rules on advertising techniques, and launching comprehensive programs to help people quit smoking. In addition, it is necessary to develop policies that support sustainable food systems, ensuring that wholesome foods, including fruits and vegetables, are widely accessible and economically viable for all socioeconomic groups.